Bayesian Analysis of Reduced Form Systems

By  Albert  Ando   and  G.M.    Kaufman 

SUMMARY 

Under  the  assumption  that  none  of  the  parameters  of  a  reduced  form 
system  are  known  with  certainty,  the  natural  conjugate  family  of 
prior  densities  for  the  joint  distribution  of  these  parameters  is 
identified.   Prior-posterior  and  preposterior  analysis  is  done 
assuming  that  the  prior  is  in  the  natural  conjugate  family,  and 
some  useful  sampling  distributions  are  derived.   A  procedure  is 
presented  for  obtaining  some  non-degenerate  joint  posterior  and 
preposterior  distributions  of  all  parameters  even  when  the  number 
of  objective  vector  sample  observations  is  less  than  the  number 
of  parameters  of  the  process. 

1.    Introduction 

The data generating process known as a simultaneous equations system among
econometricians may be described in simplified form as follows: it is a set of
stochastic equations

$$B\,y^{(j)} + \Gamma\,z^{(j)} = u^{(j)}, \qquad j = 1, 2, \ldots \tag{1a}$$

where $B$ and $\Gamma$ are $(m \times m)$ and $(m \times r)$ coefficient matrices, fixed for all $j$,
$z^{(j)}$ is an $(r \times 1)$ vector of predetermined variables, and $y^{(j)}$ and $u^{(j)}$ are
$(m \times 1)$ random vectors. It is often assumed that $\{u^{(j)},\ j = 1, 2, \ldots\}$
is a sequence of mutually independent, identically multivariate Normally distributed
random vectors [...]. If, in addition, it is assumed that $B$ is non-singular, there exists a wide variety of
methods for estimating $B$ and $\Gamma$.

Dreze [3] has suggested that Bayesian methods be applied to the analysis
of such systems when neither $B$, $\Gamma$, nor $h$ is known with certainty. In [3] he
outlines some ways of treating the system when $h$, but not $B$ and $\Gamma$, is known with
certainty. Zellner and Tiao [6] treat the reduced form system as defined in
section 1.1 below with parameters $h$ and $\Pi \equiv -B^{-1}\Gamma$ from a Bayesian point of view.
They, in effect, do prior-posterior analysis of the data generating process defined
in (1b) and derive the unconditional distribution of the $\tilde{\pi}_i$ under the assumption
that $(\tilde{\Pi}, \tilde{h})$ has a particular diffuse (and degenerate) prior density.

In this paper we identify a natural conjugate family for $(\tilde{\Pi}, \tilde{h})$, do
prior-posterior analysis under the broader assumption that the prior is in the natural
conjugate family, present sampling distributions unconditional as regards $(\tilde{\Pi}, \tilde{h})$,
and do preposterior analysis. In addition, we show how Bayesian inference can be
done even when the number of objective sample vector observations is less than the
number of unknown parameters.


1.   Definition  of  the  Process 

The reduced form data generating process is defined as one that generates
independent $(m \times 1)$ random vectors $\tilde{y}^{(1)}, \ldots, \tilde{y}^{(j)}, \ldots$ according to the model

$$\tilde{y}^{(j)} = \Pi\,z^{(j)} + \tilde{v}^{(j)}, \tag{1b}$$

where $\Pi$, a matrix of dimension $(m \times r)$, is a parameter whose value remains fixed.
Initially we assume that $z^{(j)}$ is a known $(r \times 1)$ vector which varies from observation
to observation. The $\tilde{v}^{(j)}$'s are mutually independent $(m \times 1)$ random vectors
identically distributed according to

$$f_N^{(m)}(v \mid 0, h) = (2\pi)^{-\frac{1}{2}m}\, e^{-\frac{1}{2} v^t h v}\, |h|^{\frac{1}{2}}, \tag{2}$$

and so the density of $\tilde{y}^{(j)}$ is

$$f_N^{(m)}(y^{(j)} \mid \Pi z^{(j)}, h). \tag{3}$$

Notice that if $B$ in (1a) is non-singular then premultiplying both sides of
(1a) by $B^{-1}$ transforms (1a) into the form (1b) with $\Pi = -B^{-1}\Gamma$ and $v^{(j)} = B^{-1}u^{(j)}$.
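The passage from (1a) to (1b) can be checked numerically. The sketch below is only an illustration: the matrices `B` and `Gamma` are hypothetical values chosen for the example (they do not come from the paper), and the helper names `matmul`, `inv2` and `reduced_form` are ours. It computes $\Pi = -B^{-1}\Gamma$ for $m = 2$, $r = 3$ and confirms that $B\Pi + \Gamma = 0$.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inv2(a):
    """Invert a 2 x 2 matrix; raises ZeroDivisionError if it is singular."""
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def reduced_form(B, Gamma):
    """Return Pi = -B^{-1} Gamma for a non-singular 2 x 2 matrix B."""
    return [[-x for x in row] for row in matmul(inv2(B), Gamma)]


# Hypothetical structural coefficients: m = 2 equations, r = 3 predetermined variables.
B = [[1.0, 0.5], [0.25, 1.0]]
Gamma = [[2.0, 0.0, 1.0], [0.0, 1.0, 3.0]]
Pi = reduced_form(B, Gamma)

# B Pi + Gamma should vanish, so that y = Pi z + B^{-1} u solves (1a).
residual = [[bp + g for bp, g in zip(r1, r2)]
            for r1, r2 in zip(matmul(B, Pi), Gamma)]
print(Pi)
print(residual)
```

The same check fails with a division by zero when `B` is singular, which is the case excluded in the text.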


1.1   Some  Definitions 

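For future convenience, we make the following definitions:

$$Y \equiv [y^{(1)}, \ldots, y^{(n)}], \qquad \dim(m \times n), \tag{3a}$$

$$Z \equiv [z^{(1)}, \ldots, z^{(n)}], \qquad \dim(r \times n), \tag{3b}$$

$$\Pi \equiv [\pi^{(1)}, \ldots, \pi^{(r)}] = \begin{bmatrix} \pi_1 \\ \vdots \\ \pi_m \end{bmatrix}, \qquad \dim(m \times r). \tag{3c}$$

Later, we shall have use for the matrix

$$P = [p^{(1)}, \ldots, p^{(r)}] = \begin{bmatrix} p_1 \\ \vdots \\ p_m \end{bmatrix}, \qquad \dim(m \times r). \tag{3d}$$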

1.2  Likelihood  of  a  Sample 

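The likelihood that the process defined in (1) will generate $n$ successive
values $y^{(1)}, \ldots, y^{(j)}, \ldots, y^{(n)}$ is the product of the individual likelihoods:

$$(2\pi)^{-\frac{1}{2}mn}\; e^{-\frac{1}{2}\sum_{j}(y^{(j)} - \Pi z^{(j)})^t\, h\, (y^{(j)} - \Pi z^{(j)})}\; |h|^{\frac{1}{2}n}. \tag{4}$$

If the process by which the $z^{(j)}$'s were generated is non-informative then this
is the likelihood of the sample described by $(Y, Z)$. The kernel of the likelihood is

$$e^{-\frac{1}{2}\sum_{j}(y^{(j)} - \Pi z^{(j)})^t\, h\, (y^{(j)} - \Pi z^{(j)})}\; |h|^{\frac{1}{2}n}. \tag{5}$$

Given $(Y, Z)$ we may compute these statistics:

$$\begin{aligned}
V &\equiv \sum_j z^{(j)} z^{(j)t} = Z Z^t, && \dim(r \times r), \qquad \nu \equiv n - m \ \text{(redundant)},\\
P^t &\equiv (Z Z^t)^{-1} Z\, Y^t, && \dim(r \times m),\\
e^{(j)} &\equiv y^{(j)} - P z^{(j)}, && \dim(m \times 1),\\
\varepsilon &\equiv \sum_j e^{(j)} e^{(j)t}, && \dim(m \times m).
\end{aligned} \tag{6}$$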
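As an illustration of how these statistics are computed, the following sketch evaluates $V = ZZ^t$, $P^t = (ZZ^t)^{-1} Z Y^t$, the residuals $e^{(j)} = y^{(j)} - P z^{(j)}$ and $\varepsilon = \sum_j e^{(j)} e^{(j)t}$ for a small made-up sample with $m = 2$, $r = 2$, $n = 4$. The data and the helper names are assumptions introduced for the example; the explicit $2 \times 2$ inverse limits it to $r = 2$.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inv2(a):
    """Invert a 2 x 2 matrix."""
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def sample_statistics(Y, Z):
    """Return (V, P, E, eps) for Y of size m x n and Z of size 2 x n."""
    V = matmul(Z, transpose(Z))                        # r x r
    P = transpose(matmul(inv2(V), matmul(Z, transpose(Y))))  # m x r
    fitted = matmul(P, Z)                              # m x n
    E = [[y - f for y, f in zip(yr, fr)] for yr, fr in zip(Y, fitted)]
    eps = matmul(E, transpose(E))                      # m x m
    return V, P, E, eps


# Made-up sample: columns are the observations j = 1, ..., 4.
Z = [[1.0, 1.0, 1.0, 1.0], [0.0, 1.0, 2.0, 3.0]]
Y = [[1.0, 2.0, 2.0, 4.0], [0.0, 1.0, 1.0, 3.0]]
V, P, E, eps = sample_statistics(Y, Z)

# The residuals are orthogonal to the rows of Z, i.e. P solves (Z Z^t) P^t = Z Y^t.
cross = matmul(E, transpose(Z))
print(P)
print(eps)
```

The orthogonality check on `cross` is just the normal equations restated in terms of the residuals.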

In terms of $V$, $P$, and $\varepsilon$ we may write (5) as

$$e^{-\frac{1}{2}\operatorname{tr} h\{[P - \Pi]V[P - \Pi]^t + \varepsilon\}}\; |h|^{\frac{1}{2}(\nu + m)}. \tag{7}$$

It is well known (see T. W. Anderson [1], p. 183) that (7) is the kernel of the joint likelihood of $(P, \varepsilon)$
when $\nu > 0$ and $V$ is PDS, and that given $V$, $\Pi$, and $h$, $P$ and $\varepsilon$ are independent.
It follows that the marginal likelihood of $\varepsilon$ is Wishart with parameter $(h, \nu)$
provided $\nu > 0$, and the marginal likelihood of $P$ is multivariate Normal with
kernel

$$e^{-\frac{1}{2}\operatorname{tr}(h[(P - \Pi)V(P - \Pi)^t])}\; |h|^{\frac{1}{2}}. \tag{8}$$

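We define

$$p \equiv (p_1, \ldots, p_m)^t, \qquad \dim(mr \times 1), \tag{9a}$$

$$\pi \equiv (\pi_1, \ldots, \pi_m)^t, \qquad \dim(mr \times 1), \tag{9b}$$

where $p_i$ is the $i$th row of $P$ and $\pi_i$ the $i$th row of $\Pi$, and

$$H \equiv h \otimes V, \tag{10}$$

where $\otimes$ denotes the Kronecker direct product of $h$ and $V$. Then the kernel
(7) of the joint likelihood of $(P, \varepsilon)$ may be written as

$$e^{-\frac{1}{2}(p - \pi)^t (h \otimes V)(p - \pi)}\; |h|^{\frac{1}{2}}\; e^{-\frac{1}{2}\operatorname{tr} h\varepsilon}\; |h|^{\frac{1}{2}(\nu + m - 1)}. \tag{11}$$

The kernel of the marginal likelihood of $p$ is

$$e^{-\frac{1}{2}(p - \pi)^t H (p - \pi)}\; |h|^{\frac{1}{2}}. \tag{12}$$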

Formula  (11)  is  simply  a  rearrangement--not  a  transformation--of  elements 
in  the  exponent  of  (9),  as  may  be  verified  by  writing  out  the  trace  in  (8). 

That  the  marginal  likelihood  of  p  is  (12)  follows  from  the  fact  that  (12) 
is  just  a  rearrangement  of  (9). 

When the rank $q$ of $Z$ is less than the column dimension $r$ of $\Pi$ the statistic
$P$ is not fully determined. In fact, $m(r - q)$ of the $mr$ elements of $P$ may be
assigned arbitrary values, whereupon the remaining elements of $P$ are determined
by the normal equations

$$(Z Z^t)\, P^t = Z\, Y^t. \tag{13}$$

To facilitate discussion, we assume that the first $q$ rows of $Z$ are linearly
independent. Then we may partition $Z$ as

$$Z = \begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix}, \tag{14}$$

where $Z_1$ is $(q \times n)$ and $Z_2$ is $((r - q) \times n)$, and $P$ as

$$P = [P_1 \;\; P_2], \tag{15}$$

where $P_1$ is $(m \times q)$ and $P_2$ is $(m \times (r - q))$. This allows us to write (13) as

$$\begin{bmatrix} Z_1 Z_1^t & Z_1 Z_2^t \\ Z_2 Z_1^t & Z_2 Z_2^t \end{bmatrix} \begin{bmatrix} P_1^t \\ P_2^t \end{bmatrix} = \begin{bmatrix} Z_1 Y^t \\ Z_2 Y^t \end{bmatrix}. \tag{16}$$


The projection of each $y_i$ (the $i$th row of $Y$), $i = 1, 2, \ldots, m$, on the $q$-dimensional row space of $Z$ is
unique even though there are an infinite number of linear combinations of the
$r$ rows of $Z$ in terms of which these projections may be expressed. We may
express the projection of $y_i$, $i = 1, 2, \ldots, m$, on this space as a set of unique
linear combinations $P_1$ by arbitrarily specifying, for example, that $P_2 = 0$.
As $Z_1 Z_1^t$ is of rank $q$, $P_1$ has a unique value

$$P_1^t = (Z_1 Z_1^t)^{-1} Z_1 Y^t. \tag{17}$$

By defining

$$P \equiv [P_1 \;\; 0], \tag{18}$$

we have $P Z = P_1 Z_1$ so that $P$ as defined in (18) satisfies $P$ as defined in
(6).

The definitions (6) of $e^{(j)}$, $V$, and $\varepsilon$ now go through without change,
although $\varepsilon$ will be singular if $n < m$.

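In order to exploit the information in $(P, V, \varepsilon)$ even when $\nu \le 0$, or
$q < r$, or both, we define

$$\phi \equiv \begin{cases} m - 1 & \text{if } \nu \le 0 \\ 0 & \text{if } \nu > 0 \end{cases}, \qquad \varepsilon^* \equiv \begin{cases} \varepsilon & \text{if } n \ge m \\ 0 & \text{if } n < m \end{cases},$$

and write (11) as

$$e^{-\frac{1}{2}(p - \pi)^t (h \otimes V)(p - \pi)}\; |h|^{\frac{1}{2}\delta}\; e^{-\frac{1}{2}\operatorname{tr} h\varepsilon^*}\; |h|^{\frac{1}{2}(\nu + m - \phi - 1)}. \tag{12}$$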

Notice  that  even  if  q  <  r,  the  joint  likelihood  of  (P,,  |)  -    (p ,  ,  e)  may  exist. 

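1.2 Conjugate Distribution of $(\tilde{\Pi}, \tilde{h})$, $\tilde{\Pi}$, and $\tilde{h}$

When neither $\Pi$ nor $h$ is known with certainty but both are regarded as random
variables, the natural conjugate of (12) is the Normal-Wishart density

$$f_{NW}^{(mr)}(\Pi, h \mid P, V, \varepsilon, \nu)$$

defined as equal to

$$k(m, r, \nu)\; e^{-\frac{1}{2}\operatorname{tr} h[(\Pi - P)V(\Pi - P)^t]}\; |h|^{\frac{1}{2}}\; e^{-\frac{1}{2}\operatorname{tr} \varepsilon h}\; |h|^{\frac{1}{2}\nu - 1}\; |\varepsilon|^{\frac{1}{2}(\nu + m - 1)} \tag{13a}$$

$$\equiv f_N^{(mr)}(\Pi \mid P, h \otimes V)\; f_W^{(m)}(h \mid \varepsilon, \nu) \tag{13b}$$

if $\nu > 0$ and $V$ and $\varepsilon$ are PDS, and as equal to 0 otherwise, where

$$k(m, r, \nu) \equiv \Big[ 2^{\frac{1}{2}m(\nu + r + m - 1)}\; \pi^{m(m + 2r - 1)/4} \prod_{i=1}^{m} \Gamma\big(\tfrac{1}{2}(\nu + m - i)\big) \Big]^{-1}. \tag{13c}$$

The conjugate family defined by (13) parallels that defined in (6a) of [2]
for the Multinormal process.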

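We obtain the marginal prior on $\tilde{\Pi}$ by integrating (13a) with respect to
$h$; if $\nu > 0$, and $V$ and $\varepsilon$ are PDS, then

$$D(\Pi \mid P, V, \varepsilon, \nu) \equiv f_S^{(mr)}(\Pi \mid P, V, \varepsilon, \nu) \propto |\varepsilon + L|^{-\frac{1}{2}(\nu + m)}, \tag{14a}$$

where

$$L \equiv [\Pi - P]V[\Pi - P]^t = \begin{bmatrix} (\pi_1 - p_1)V(\pi_1 - p_1)^t & \cdots & (\pi_m - p_m)V(\pi_1 - p_1)^t \\ \vdots & & \vdots \\ (\pi_1 - p_1)V(\pi_m - p_m)^t & \cdots & (\pi_m - p_m)V(\pi_m - p_m)^t \end{bmatrix}. \tag{14b}$$

We will call a distribution of the form (14) a non-degenerate generalized
multivariate Student distribution.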


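Proof: We integrate $h$ over the region $R_h \equiv \{h \mid h \text{ is PDS}\}$. If $\nu > 0$, and $V$ and $\varepsilon$
are PDS,

$$\begin{aligned}
D(\Pi \mid P, V, \varepsilon, \nu) &= \int_{R_h} f_N^{(mr)}(\Pi \mid P, h \otimes V)\; f_W^{(m)}(h \mid \varepsilon, \nu)\, dh \\
&\propto \int_{R_h} e^{-\frac{1}{2}\operatorname{tr} h[(\Pi - P)V(\Pi - P)^t]}\; |h|^{\frac{1}{2}} \cdot e^{-\frac{1}{2}\operatorname{tr} h\varepsilon}\; |h|^{\frac{1}{2}\nu - 1}\, dh \\
&= \int_{R_h} e^{-\frac{1}{2}\operatorname{tr} h\{[\Pi - P]V[\Pi - P]^t + \varepsilon\}}\; |h|^{\frac{1}{2}(\nu + 1) - 1}\, dh.
\end{aligned}$$

The integrand in the last expression is the kernel of a Wishart density with
parameter $([\Pi - P]V[\Pi - P]^t + \varepsilon,\ \nu + 1)$, and so

$$D(\Pi \mid P, V, \varepsilon, \nu) \propto \big|[\Pi - P]V[\Pi - P]^t + \varepsilon\big|^{-\frac{1}{2}(\nu + m)},$$

proving (14).

From (13b) it is obvious that the marginal distribution of $\tilde{h}$ is Wishart with
parameter $(\varepsilon, \nu)$.

Tiao and Zellner [6] have shown the important result that the conditional
distribution of $\tilde{\pi}_i$ given $\pi_j$, $1 \le j \le m$, $j \ne i$, is multivariate Student, and that the
marginal distribution of $\tilde{\pi}_i$ is also multivariate Student.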

1.3  Prior-Posterior  Analysis 

If a Normal-Wishart distribution with parameter $(P', V', \varepsilon', \nu')$ is assigned
to $(\tilde{\Pi}, \tilde{h})$ and if a sample then yields a statistic $(P, V, \varepsilon, \nu)$, the posterior
distribution of $(\tilde{\Pi}, \tilde{h})$ will be Normal-Wishart with parameter $(P'', V'', \varepsilon''^{*}, \nu'')$
where

$$V'' = V' + V, \qquad \delta'' = \begin{cases} 1 & \text{if } V'' \text{ is PDS} \\ 0 & \text{otherwise} \end{cases}, \tag{16a}$$

$$\nu'' = \nu' + \nu + m + \delta' + \delta - \delta'' - \phi - 1, \tag{16b}$$

$$P'' = [P'V' + PV]\,V''^{-1}, \tag{16c}$$

and

$$\varepsilon''^{*} = \begin{cases} \varepsilon' + \varepsilon + P'V'P'^t + PVP^t - P''V''P''^t \equiv \varepsilon'' & \text{if } \varepsilon'' \text{ is PDS} \\ 0 & \text{otherwise.} \end{cases} \tag{16d}$$

When $\varepsilon'$ and $\varepsilon$ are both PDS, the prior density and the sample likelihood combine
to give the posterior density in the usual manner. As was the case with the
Multinormal process treated in [1], when either $\varepsilon'$ or $\varepsilon$ or both are singular, the
prior density (13) or the sample likelihood (7) or both may not exist. Even in
such cases, we wish to allow for the possibility that the posterior density may
be well defined. To this end we define the parameter of the posterior density
in terms of $\varepsilon'$ and $\varepsilon$ rather than $\varepsilon'^{*}$ and $\varepsilon^{*}$.

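Proof: Multiplying the kernel of the likelihood (7) by the kernel (13) of the
prior we obtain

$$\begin{aligned}
& e^{-\frac{1}{2}\operatorname{tr} h[\Pi - P']V'[\Pi - P']^t}\; |h|^{\frac{1}{2}\delta'}\; e^{-\frac{1}{2}\operatorname{tr} \varepsilon' h}\; |h|^{\frac{1}{2}\nu' - 1} \cdot e^{-\frac{1}{2}\operatorname{tr} h(P - \Pi)V(P - \Pi)^t}\; |h|^{\frac{1}{2}\delta}\; e^{-\frac{1}{2}\operatorname{tr} h\varepsilon}\; |h|^{\frac{1}{2}(\nu + m - \phi - 1)} \\
&\qquad = e^{-\frac{1}{2}S}\; |h|^{\frac{1}{2}\delta''}\; e^{-\frac{1}{2}\operatorname{tr} h(\varepsilon' + \varepsilon)}\; |h|^{\frac{1}{2}(\nu' + \nu + m + \delta' + \delta - \delta'' - \phi - 1) - 1},
\end{aligned}$$

where

$$\begin{aligned}
S &= \operatorname{tr} h\{(P - \Pi)V(P - \Pi)^t + (\Pi - P')V'(\Pi - P')^t\} \\
&= \operatorname{tr} h\{PVP^t - PV\Pi^t - \Pi VP^t + \Pi V\Pi^t + \Pi V'\Pi^t - P'V'\Pi^t - \Pi V'P'^t + P'V'P'^t\},
\end{aligned}$$

or, since $\operatorname{tr} h\{\Pi V P^t\} = \operatorname{tr} h\{(\Pi V P^t)^t\} = \operatorname{tr} h\{P V \Pi^t\}$,

$$S = \operatorname{tr} h\{\Pi V'\Pi^t + \Pi V\Pi^t - 2\Pi(V'P'^t + VP^t) + P'V'P'^t + PVP^t\}.$$

Defining $V'' = V' + V$ as in (16a), $P''^t = V''^{-1}(V'P'^t + VP^t)$ as in (16c), and completing
the square, gives

$$S = \operatorname{tr} h\{(\Pi - P'')V''(\Pi - P'')^t + P'V'P'^t + PVP^t - P''V''P''^t\}.$$

Defining $\nu''$ as in (16b) and

$$\varepsilon''^{*} = \begin{cases} \varepsilon' + \varepsilon + P'V'P'^t + PVP^t - P''V''P''^t \equiv \varepsilon'' & \text{if } \varepsilon'' \text{ is PDS} \\ 0 & \text{otherwise} \end{cases}$$

we have (16d), completing the proof.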
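The completing-the-square step can be checked numerically. The sketch below covers only the non-degenerate case ($\delta' = \delta = \delta'' = 1$, $\phi = 0$), in which (16b) reduces to $\nu'' = \nu' + \nu + m$. All numerical values, the function `posterior`, and the other helper names are illustrative assumptions and are not taken from the paper. For an arbitrary $\Pi$ it compares the two sides of the identity used above.

```python
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inv2(a):
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def add(a, b, sign=1.0):
    """Elementwise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def quad(A, W):
    """Return A W A^t."""
    return matmul(matmul(A, W), transpose(A))


def posterior(P0, V0, eps0, nu0, P, V, eps, nu):
    """Posterior parameters (16a)-(16d), non-degenerate case only (r = 2)."""
    m = len(P)
    V2 = add(V0, V)
    P2 = matmul(add(matmul(P0, V0), matmul(P, V)), inv2(V2))
    eps2 = add(add(add(add(eps0, eps), quad(P0, V0)), quad(P, V)), quad(P2, V2), -1.0)
    return P2, V2, eps2, nu0 + nu + m


# Hypothetical prior parameter (P', V', eps', nu') and sample statistic (P, V, eps, nu).
P0, V0 = [[1.0, 0.0], [0.5, -1.0]], [[2.0, 0.5], [0.5, 1.0]]
eps0, nu0 = [[1.0, 0.0], [0.0, 1.0]], 3
P, V = [[0.5, 1.0], [-0.5, 2.0]], [[4.0, 6.0], [6.0, 14.0]]
eps, nu = [[2.0, 0.5], [0.5, 1.0]], 2
P2, V2, eps2, nu2 = posterior(P0, V0, eps0, nu0, P, V, eps, nu)

# Identity behind S, checked at an arbitrary value of Pi.
Pi = [[0.3, -0.7], [1.1, 0.2]]
lhs = add(quad(add(P, Pi, -1.0), V), quad(add(Pi, P0, -1.0), V0))
rhs = add(quad(add(Pi, P2, -1.0), V2), add(eps2, add(eps0, eps), -1.0))
gap = max(abs(x - y) for rl, rr in zip(lhs, rhs) for x, y in zip(rl, rr))
print(nu2, gap)
```

Because the identity holds for every $\Pi$, agreement at one arbitrary point is only a sanity check, not a proof.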

2.   Sampling  Distributions  with  Fixed  n 

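We assume here that a sample of size $n$ is to be drawn from an $(m \times r)$
dimensional reduced form process as defined in section 1 whose parameter $(\tilde{\Pi}, \tilde{h})$
is a random variable having a Normal-Wishart distribution with parameter
$(P', V', \varepsilon', \nu')$.

2.1 Conditional Joint Distribution of $(\tilde{P}, \tilde{\varepsilon} \mid \Pi, h, V)$

The conditional joint distribution of the statistic $(\tilde{P}, \tilde{\varepsilon})$ given that the
process parameter has value $(\Pi, h)$ and given $V$ is, provided $\nu > 0$,

$$\begin{aligned}
D(P, \varepsilon \mid \Pi, h;\ \nu, V) &= f_N^{(mr)}(P \mid \Pi, h \otimes V)\; f_W^{(m)}(\varepsilon \mid h, \nu) \\
&= f_N^{(mr)}(p \mid \pi, h \otimes V)\; f_W^{(m)}(\varepsilon \mid h, \nu).
\end{aligned} \tag{17}$$

2.1 Unconditional Joint Distribution of $(\tilde{P}, \tilde{\varepsilon})$

The unconditional joint distribution of $(\tilde{P}, \tilde{\varepsilon})$ as regards $(\tilde{\Pi}, \tilde{h})$, provided
$\nu > 0$ and $V_u$ is PDS, is

$$D(P, \varepsilon \mid P', V', \varepsilon', \nu';\ n, \nu, V) \propto \frac{|\varepsilon|^{\frac{1}{2}\nu - 1}}{\big|[P - P']V_u[P - P']^t + \varepsilon + \varepsilon'\big|^{\frac{1}{2}(\nu'' + m - 1)}}, \tag{18a}$$

where

$$V_u^{-1} \equiv V'^{-1} + V^{-1} = V'^{-1}\, V''\, V^{-1}. \tag{18b}$$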
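Proof: From (13) and (17),

$$\begin{aligned}
D(P, \varepsilon \mid P', V', \varepsilon', \nu';\ n, \nu, V) = \int_{R_h} &\Big[ \int f_N^{(mr)}(P \mid \Pi, h \otimes V)\; f_N^{(mr)}(\Pi \mid P', h \otimes V')\, d\Pi \Big] \\
&\cdot f_W^{(m)}(\varepsilon \mid h, \nu)\; f_W^{(m)}(h \mid \varepsilon', \nu')\, dh.
\end{aligned}$$

The inner integral is $f_N^{(mr)}(P \mid P', h \otimes V_u)$. For since both $\tilde{\Pi}$ and $\tilde{\xi} \equiv \tilde{P} - \tilde{\Pi}$ are
independent Multinormal random variables, it follows that $\tilde{P}$ is Multinormal with
mean matrix

$$E(\tilde{P}) = E(\tilde{\xi}) + E(\tilde{\Pi}) = 0 + E(\tilde{\Pi}) = P'$$

and variance-covariance matrix

$$V(\tilde{P}) = V(\tilde{\xi}) + V(\tilde{\Pi}) = (h \otimes V)^{-1} + (h \otimes V')^{-1} = (h^{-1} \otimes V^{-1}) + (h^{-1} \otimes V'^{-1}) = h^{-1} \otimes (V^{-1} + V'^{-1}).$$

Consequently the matrix precision of $\tilde{P}$ is

$$[h^{-1} \otimes (V^{-1} + V'^{-1})]^{-1} = h \otimes (V^{-1} + V'^{-1})^{-1} = h \otimes V_u,$$

as

$$V^{-1} + V'^{-1} = V'^{-1}(V' + V)V^{-1} = V'^{-1}\, V''\, V^{-1} = V_u^{-1}.$$

Hence we may write the integral above as

$$\int_{R_h} f_N^{(mr)}(P \mid P', h \otimes V_u)\; f_W^{(m)}(\varepsilon \mid h, \nu)\; f_W^{(m)}(h \mid \varepsilon', \nu')\, dh.$$

Using the definition (16b) the last integral is proportional to

$$|\varepsilon|^{\frac{1}{2}\nu - 1} \int_{R_h} e^{-\frac{1}{2}\operatorname{tr} h\{[P - P']V_u[P - P']^t + \varepsilon + \varepsilon'\}}\; |h|^{\frac{1}{2}\nu'' - 1}\, dh.$$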


Since  here  v  >  0,  V'''=V,  and  5"=1,  the  integrand  in  the  above  integral  is 
the kernel of a Wishart density with parameter $([P - P']V_u[P - P']^t + \varepsilon + \varepsilon',\ \nu'')$; hence
the kernel of the distribution of $(\tilde{P}, \tilde{\varepsilon})$ unconditional as regards $\tilde{\Pi}$ and $\tilde{h}$ is (18a).
2.2  Unconditional  Distribution  of  P 


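The distribution of $\tilde{P}$ unconditional as regards $\tilde{\Pi}$, $\tilde{h}$, and $\tilde{\varepsilon}$ is, provided $V$
is PDS, the generalized multivariate Student distribution defined in (14). That is,

$$D(P \mid P', V', \varepsilon', \nu';\ n, V) \propto \big|[P - P']V_u[P - P']^t + \varepsilon'\big|^{-\frac{1}{2}(\nu' + m)} \tag{19}$$

whether or not $\nu \le 0$.

Proof: From (9), the kernel of the marginal likelihood of $P$ given $(\Pi, h)$ is

$$e^{-\frac{1}{2}\operatorname{tr} h([P - \Pi]V[P - \Pi]^t)}\; |h|^{\frac{1}{2}}. \tag{9}$$

If $P$ is defined as in (18) when $\nu \le 0$, then (9) is the marginal likelihood of
$P$ whether or not $\nu \le 0$. Conditional on $\tilde{h} = h$, the prior distribution of $\tilde{\Pi}$ is,
from (13), Multinormal with parameter $(P', h \otimes V')$.

Consequently, as shown in the course of the proof of (18a), the distribution
of $\tilde{P}$ given $\tilde{h} = h$ but unconditional as regards $\tilde{\Pi}$ is Multinormal with parameter
$(P', h \otimes V_u)$ where $V_u$ is defined in (18b). This implies that the distribution
of $\tilde{P}$ unconditional as regards $\tilde{\Pi}$, $\tilde{\varepsilon}$, and $\tilde{h}$ is

$$\begin{aligned}
&\int\!\!\int f_N^{(mr)}(P \mid P', h \otimes V_u)\; f_W^{(m)}(\varepsilon \mid h, \nu)\; f_W^{(m)}(h \mid \varepsilon', \nu')\, dh\, d\varepsilon \\
&\qquad = \int_{R_h} f_N^{(mr)}(P \mid P', h \otimes V_u)\; f_W^{(m)}(h \mid \varepsilon', \nu')\, dh \\
&\qquad \propto \int_{R_h} e^{-\frac{1}{2}\operatorname{tr} h\{[P - P']V_u[P - P']^t + \varepsilon'\}}\; |h|^{\frac{1}{2}(\nu' + 1) - 1}\, dh.
\end{aligned}$$

The integrand in the integral immediately above is the kernel of a Wishart density
with parameter $([P - P']V_u[P - P']^t + \varepsilon',\ \nu' + 1)$. Hence (19) follows.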
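2.3 Unconditional Distribution of $\tilde{\varepsilon}$

The distribution of $\tilde{\varepsilon}$ unconditional as regards $\tilde{P}$, $\tilde{\Pi}$, and $\tilde{h}$, provided that
$\nu > 0$ and $\varepsilon'$ is PDS, is a non-standardized generalized inverted Beta distribution (20) with parameter
$(\tfrac{1}{2}\nu - 1,\ \tfrac{1}{2}(\nu' + r + m) - 1,\ \varepsilon')$ as defined in (9g) of [1].

Proof: The marginal likelihood of $\tilde{\varepsilon}$ is $f_W^{(m)}(\varepsilon \mid h, \nu)$ and so does not depend on $\Pi$.
The marginal prior distribution of $\tilde{h}$ is $f_W^{(m)}(h \mid \varepsilon', \nu')$. So, when $\nu > 0$, $\delta'' = 1$ and
the density of $\tilde{\varepsilon}$ is

$$\int_{R_h} f_W^{(m)}(\varepsilon \mid h, \nu)\; f_W^{(m)}(h \mid \varepsilon', \nu')\, dh \propto |\varepsilon|^{\frac{1}{2}\nu - 1} \int_{R_h} e^{-\frac{1}{2}\operatorname{tr} h\{\varepsilon + \varepsilon'\}}\; |h|^{\frac{1}{2}(\nu'' - 1) - 1}\, dh.$$

Since the integrand in the integral immediately above is the kernel of a Wishart
density with parameter $(\varepsilon + \varepsilon',\ \nu'' - 1)$, (20) follows directly.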
2.4 Unconditional Distribution of a Sample Observation $\tilde{y}^{(j)}$ given $z^{(j)}$

Suppose we wish to make a probability forecast of a sample observation $\tilde{y}^{(j)}$
before the value it assumes is observed, but knowing $z^{(j)}$. The conditional
distribution of $\tilde{y}^{(j)}$ given $\Pi$, $h$, and $z^{(j)}$ is simply (3). However, if we regard
$\tilde{\Pi}$ and $\tilde{h}$ as jointly distributed random variables to which we have assigned a
Normal-Wishart prior distribution with parameter $(P', V', \varepsilon', \nu')$, then the
distribution of most interest to us is the distribution of $\tilde{y}^{(j)}$ given $z^{(j)}$ but
unconditional as regards $(\tilde{\Pi}, \tilde{h})$. This distribution is, provided $\nu' > 0$ and
$H_y$ is PDS,

$$D(y^{(j)} \mid P', V', \varepsilon', \nu';\ 1, z^{(j)}) = f_S^{(m)}(y^{(j)} \mid P' z^{(j)}, H_y, \nu') \tag{21a}$$

where, with $V = z^{(j)} z^{(j)t}$ and $V'' = V' + V$ for this single observation,

$$\begin{aligned}
k_y &\equiv 1 - z^{(j)t}\, V''^{-1} z^{(j)} = |V''^{-1}|\,|V'|, \\
H_y &\equiv \nu' k_y \big(\varepsilon' + P'[V' - V'V''^{-1}V' - k_y V]P'^t\big)^{-1}.
\end{aligned} \tag{21b}$$
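Proof: We prove (21) by twice completing a square and then integrating. For
notational simplicity, we drop the superscript on $y^{(j)}$ and $z^{(j)}$ throughout the
proof. From (3) and (13),

$$D(y \mid P', V', \varepsilon', \nu';\ 1, z) = \int_{R_h}\!\int f_N^{(m)}(y \mid \Pi z, h)\; f_N^{(mr)}(\Pi \mid P', h \otimes V')\; f_W^{(m)}(h \mid \varepsilon', \nu')\, d\Pi\, dh.$$

Dropping constants and completing the square in $\Pi$ in the exponent of the integrand
above allows us to write this expression as proportional to

$$\int_{R_h} e^{-\frac{1}{2}\operatorname{tr} h\{\varepsilon' + yy^t + P'V'P'^t - (yz^t + P'V')V''^{-1}(yz^t + P'V')^t\}}\; |h|^{\frac{1}{2}(\nu' + 1) - 1} \Big[ \int e^{-\frac{1}{2}\operatorname{tr} h\{(\Pi - (yz^t + P'V')V''^{-1})\,V''\,(\Pi - (yz^t + P'V')V''^{-1})^t\}}\; |h|^{\frac{1}{2}}\, d\Pi \Big]\, dh. \tag{21c}$$

The integrand of the integral in square brackets is, aside from a multiplicative
constant not involving $y$, a Normal density, so that

$$D(y \mid P', V', \varepsilon', \nu';\ 1, z) \propto \int_{R_h} e^{-\frac{1}{2}\operatorname{tr} hB}\; |h|^{\frac{1}{2}(\nu' + 1) - 1}\, dh,$$

where $B$ is the $(m \times m)$ matrix in curly brackets in the exponent outside the
integral in square brackets in (21c). Since the integrand of the integral
immediately above is the kernel of a Wishart density with parameter $(B, \nu' + 1)$,

$$D(y \mid P', V', \varepsilon', \nu';\ 1, z) \propto |B|^{-\frac{1}{2}(\nu' + m)},$$

which is the kernel of a multivariate Student distribution.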
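By completing the square in $y$ in $B$ we may write this kernel as

$$\big|\Phi + k_y (y - \mu_y)(y - \mu_y)^t\big|^{-\frac{1}{2}(\nu' + m)},$$

where

$$\mu_y^t \equiv k_y^{-1}\, z^t V''^{-1} V' P'^t, \qquad \Phi \equiv \varepsilon' + P'V'P'^t - P'V'V''^{-1}V'P'^t - k_y\, \mu_y \mu_y^t.$$

It remains to be shown that $\mu_y = P'z$ and that $H_y$ may be expressed as in (21b).
To this end observe that

$$k_y = 1 - z^t V''^{-1} z = |V''^{-1}|\,|V'' - z z^t| = |V''^{-1}|\,|V'|.$$

Thus

$$\mu_y^t = k_y^{-1}\, z^t V''^{-1}[V'' - z z^t]P'^t = k_y^{-1}[z^t - (z^t V''^{-1} z)\, z^t]P'^t = z^t P'^t.$$

It follows immediately that

$$\Phi = \varepsilon' + P'(V' - V'V''^{-1}V' - k_y V)P'^t$$

and that $H_y = \nu' k_y \Phi^{-1}$ is as defined in (21b).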
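The determinant identity for $k_y$ and the relation $V'V''^{-1}z = k_y z$ that underlies $\mu_y = P'z$ are easy to confirm numerically. In the sketch below the prior matrix `V_prior` and the vector `z` are hypothetical values with $r = 2$. The third expression, $k_y = 1/(1 + z^t V'^{-1} z)$, is not stated in the text; it follows from the standard rank-one update formula and is included only as a cross-check.

```python
def inv2(a):
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def det2(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def quad_form(x, a):
    """Return x^t a x."""
    return sum(xi * yi for xi, yi in zip(x, matvec(a, x)))


# Hypothetical prior V' and predetermined vector z (r = 2).
V_prior = [[2.0, 0.5], [0.5, 1.0]]
z = [1.0, -2.0]

# For a single observation V = z z^t, so V'' = V' + z z^t.
V_post = [[V_prior[i][j] + z[i] * z[j] for j in range(2)] for i in range(2)]

k_def = 1.0 - quad_form(z, inv2(V_post))           # definition in (21b)
k_det = det2(V_prior) / det2(V_post)               # |V''^{-1}| |V'|
k_alt = 1.0 / (1.0 + quad_form(z, inv2(V_prior)))  # rank-one update cross-check

# V' V''^{-1} z should equal k_y z, which gives mu_y = P' z for any P'.
lhs = matvec(V_prior, matvec(inv2(V_post), z))
print(k_def, k_det, k_alt, lhs)
```

Since $0 < k_y \le 1$ whenever $V'$ is PDS, the predictive precision $k_y h$ is never larger than the process precision $h$.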

3.   Preposterior  Analysis  with  Fixed  n  >  0 

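We assume that a sample of fixed size $n > 0$ is to be drawn from an $(m \times 1)$
reduced form data generating process as defined in section 1. The parameters
$(\tilde{\Pi}, \tilde{h})$ of the process are not known with certainty but are assumed to be random
variables having a Normal-Wishart prior with parameter $(P', V', \varepsilon', \nu')$ where
$\nu' > 0$.

3.1 Joint Distribution of $(\tilde{P}'', \tilde{\varepsilon}'')$ given $V$

Provided $\nu > 0$, and $V^*$ is PDS, the joint distribution of $(\tilde{P}'', \tilde{\varepsilon}'')$ is

$$D(P'', \varepsilon'' \mid P', V', \varepsilon', \nu';\ n, \nu, V) \propto \frac{\big|\varepsilon'' - \varepsilon' - [P'' - P']V^*[P'' - P']^t\big|^{\frac{1}{2}\nu - 1}}{|\varepsilon''|^{\frac{1}{2}(\nu'' + m - 1)}}, \tag{22a}$$

where

$$V^* \equiv V'\, V^{-1}\, V'' = V''\, V^{-1}\, V' \tag{22b}$$

and the range of $\tilde{\varepsilon}''$ is $R_{\varepsilon''} \equiv \{\varepsilon'' \mid \varepsilon'' - \varepsilon' - [P'' - P']V^*[P'' - P']^t \text{ is PDS}\}$.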

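Proof: In a fashion similar to that used in establishing (12-20b) in [2], we
can show that

$$[P - P']V_u[P - P']^t = [P'' - P']V^*[P'' - P']^t, \tag{23a}$$

and that when $\varepsilon''$ is PDS,

$$\varepsilon'' = \varepsilon' + \varepsilon + P'V'P'^t + PVP^t - P''V''P''^t = \varepsilon' + \varepsilon + [P'' - P']V^*[P'' - P']^t. \tag{23b}$$

From (16c) and (16d) we have

$$(P, \varepsilon) = \big([P''V'' - P'V']V^{-1},\ \ \varepsilon'' - \varepsilon' - [P'' - P']V^*[P'' - P']^t\big). \tag{24}$$

Letting $J(P'', \varepsilon'';\ P, \varepsilon)$ denote the Jacobian of the integrand transformation from
$(P, \varepsilon)$ to $(P'', \varepsilon'')$ and making this transformation in (18a) we obtain

$$D(P'', \varepsilon'' \mid P', V', \varepsilon', \nu';\ n, \nu, V) \propto \frac{\big|\varepsilon'' - \varepsilon' - [P'' - P']V^*[P'' - P']^t\big|^{\frac{1}{2}\nu - 1}}{|\varepsilon''|^{\frac{1}{2}(\nu'' + m - 1)}}\; J(P'', \varepsilon'';\ P, \varepsilon).$$

Since $J(P'', \varepsilon'';\ P, \varepsilon) = J(P'', P)\, J(\varepsilon'', \varepsilon)$ and since both $J(P'', P)$ and $J(\varepsilon'', \varepsilon)$ are
constants involving neither $P$ nor $\varepsilon$, we may write the above kernel as (22a).

That the range of $\tilde{\varepsilon}''$ is as shown in (19b) follows from the definitions of
$P''$, $\varepsilon''$, $P$, and $\varepsilon$.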
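Identity (23a) and the inverse map (24) are stated without derivation, so a numerical check may be useful. The sketch below uses hypothetical $2 \times 2$ values for $P'$, $V'$, $P$ and $V$; the helper names are ours. It forms $V_u = (V^{-1} + V'^{-1})^{-1}$ and $V^* = V'V^{-1}V''$, compares the two quadratic forms in (23a), and recovers $P$ from $P''$ through (24).

```python
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inv2(a):
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def add(a, b, sign=1.0):
    """Elementwise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def quad(A, W):
    """Return A W A^t."""
    return matmul(matmul(A, W), transpose(A))


def max_gap(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Hypothetical prior parameters and sample statistics (m = r = 2).
P_prior, V_prior = [[1.0, 0.0], [0.5, -1.0]], [[2.0, 0.5], [0.5, 1.0]]
P, V = [[0.5, 1.0], [-0.5, 2.0]], [[4.0, 6.0], [6.0, 14.0]]

V_post = add(V_prior, V)                                               # (16a)
P_post = matmul(add(matmul(P_prior, V_prior), matmul(P, V)), inv2(V_post))  # (16c)
V_u = inv2(add(inv2(V), inv2(V_prior)))                                # (18b)
V_star = matmul(matmul(V_prior, inv2(V)), V_post)                      # (22b)

gap_23a = max_gap(quad(add(P, P_prior, -1.0), V_u),
                  quad(add(P_post, P_prior, -1.0), V_star))

# Inverse map (24): P = [P''V'' - P'V'] V^{-1}.
P_back = matmul(add(matmul(P_post, V_post), matmul(P_prior, V_prior), -1.0), inv2(V))
gap_24 = max_gap(P_back, P)
print(gap_23a, gap_24)
```

Both gaps should be at the level of floating-point rounding; the map from $P$ to $P''$ is linear, which is why its Jacobian is a constant.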
3.2 Some Distributions of $\tilde{P}''$ and $\tilde{\varepsilon}''$

The distribution of $\tilde{P}''$, unconditional as regards $\tilde{\Pi}$, $\tilde{h}$, and $\tilde{\varepsilon}''$ is, provided
that $V^* \equiv V'V^{-1}V''$ is PDS,

$$D(P'' \mid P', V', \varepsilon', \nu';\ n, V) \propto \big|[P'' - P']V^*[P'' - P']^t + \varepsilon'\big|^{-\frac{1}{2}(\nu' + m)}. \tag{25}$$

Proof: From (19), the unconditional distribution of $\tilde{P}$ is generalized multivariate
Student with parameter $(P', V_u, \varepsilon', \nu')$, and from (16c),

$$P'' = [P'V' + PV]\,V''^{-1}.$$

Rewriting the above as

$$p'' = p'(I \otimes V'V''^{-1}) + p\,(I \otimes VV''^{-1})$$

it is easy to see that the Jacobian of the transformation from $p$ to $p''$ is
$|I \otimes VV''^{-1}|^{-1} = |VV''^{-1}|^{-m}$, a constant as regards $p''$. Thus substituting according
to the dictates of (23a) into (19) yields the distribution of $\tilde{P}''$ as shown in (25).

The distribution of $\tilde{\varepsilon}''$ unconditional as regards $\tilde{\Pi}$ and $\tilde{h}$, but given $V$ and $P''$,
is found in a similar fashion to be

D(i"|P',  y',  i';  n,  V,  V,  P")g.  \%      I     III        E  y  g  +g  V  VJ       ^26) 

leM.p.y.g.t.p  y  pt^g„y„g„t|i(v"+m) 

with range set $R_{\varepsilon''}$.

4. Analysis When Rank $(Z Z^t) < r$


4.1 Inference When Rank $(Z Z^t) < r$

Even when the rank $q$ of $Z Z^t$ is less than $r$, it is possible to do
inference on $(\tilde{\Pi}, \tilde{h})$ by appropriately structuring the prior so that the posterior
of $(\tilde{\Pi}, \tilde{h})$ is non-degenerate. The procedure here is analogous to that suggested
in section 3.3 of [5].
For example, suppose the data generating process is that of section 1.1, we
assign a prior on $(\tilde{\Pi}, \tilde{h})$ with parameter $(0, 0, 0, 0)$, and then we observe a sample
$(y^{(1)}, z^{(1)}), (y^{(2)}, z^{(2)}), \ldots, (y^{(n)}, z^{(n)})$ where $n < m$. The posterior distribution
of $(\tilde{\Pi}, \tilde{h})$ is degenerate in this case, as the posterior parameters (16) assume
values

$$\nu'' = \nu + m - \phi - 1 = 0, \qquad P'' = P, \qquad V''^{*} = 0, \qquad \varepsilon''^{*} = 0.$$

If, however, we require the prior on $(\tilde{\Pi}, \tilde{h})$ to be very diffuse, but are willing to
introduce just enough prior information to make $\nu'' > 0$, $V''$ PDS, and $\varepsilon''$ PDS, then a
non-degenerate posterior will exist; e.g. assign $\nu' = 1$, $V' = M I$, $M > 0$, and $\varepsilon' = K I$,
$K > 0$, so that $\nu'' = 1$, $V'' = M I + V$, and $\varepsilon'' = K I + \varepsilon$. The posterior on $(\tilde{\Pi}, \tilde{h})$ is then
non-degenerate Normal-Wishart with parameter $(P'', V'', \varepsilon'', \nu'') = (P,\ M I + V,\ K I + \varepsilon,\ 1)$.

Notice that if $m < q < r$, $\varepsilon$ is PDS, so that the posterior will be non-degenerate
even if $\varepsilon' = 0$, so long as $\nu'' > 0$ and $V''$ is PDS.

To see that in fact a prior on $(\tilde{\Pi}, \tilde{h})$ with parameter $(0, M I, K I, 1)$ is
extremely diffuse, we state the following result proved by Martin [4], who gives
explicit formulas for calculating the means, variances, and covariances of elements
of the variance-covariance matrix $\tilde{h}^{-1}$:

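Theorem: If $\tilde{h}$ is $(m \times m)$ and Wishart distributed with parameter $(\varepsilon, \nu)$ then
$E(\tilde{h}^{-1}) = \frac{1}{\nu - 2}\,\varepsilon$ if $\nu > 2$; and letting $\tilde{h}^{\alpha\beta}$ denote the $(\alpha, \beta)$th element of $\tilde{h}^{-1}$, for
$1 \le \alpha, \beta, \gamma, \delta \le m$,

$$\operatorname{cov}(\tilde{h}^{\alpha\beta}, \tilde{h}^{\gamma\delta}) = \frac{1}{(\nu - 1)(\nu - 2)(\nu - 4)} \Big[ \frac{2}{\nu - 2}\, \varepsilon_{\alpha\beta}\varepsilon_{\gamma\delta} + \varepsilon_{\alpha\gamma}\varepsilon_{\beta\delta} + \varepsilon_{\alpha\delta}\varepsilon_{\beta\gamma} \Big]$$

if $\nu > 4$.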
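The moment conditions in the theorem can be made concrete with a short calculation. The sketch below assumes the standard inverse-Wishart moment formulas, translated to the parametrization used here on the assumption that a Wishart parameter $(\varepsilon, \nu)$ corresponds to $\nu + m - 1$ degrees of freedom; the function names and the value $K = 6$ are illustrative only. It returns the mean and covariances of the elements of $\tilde{h}^{-1}$ and refuses to do so when the moments do not exist.

```python
def inv_precision_mean(eps, nu):
    """E(h^{-1}) = eps / (nu - 2); exists only for nu > 2."""
    if nu <= 2:
        raise ValueError("first moment of h^{-1} does not exist for nu <= 2")
    return [[e / (nu - 2) for e in row] for row in eps]


def inv_precision_cov(eps, nu, a, b, c, d):
    """cov(h^{ab}, h^{cd}) for 0-based indices; exists only for nu > 4."""
    if nu <= 4:
        raise ValueError("second moments of h^{-1} do not exist for nu <= 4")
    scale = 1.0 / ((nu - 1) * (nu - 2) * (nu - 4))
    return scale * (2.0 / (nu - 2) * eps[a][b] * eps[c][d]
                    + eps[a][c] * eps[b][d]
                    + eps[a][d] * eps[b][c])


# Illustrative case: eps = K I with K = 6 and nu = 5 (m = 2).
K = 6.0
eps = [[K, 0.0], [0.0, K]]
mean = inv_precision_mean(eps, 5)                 # (K / 3) I
var_11 = inv_precision_cov(eps, 5, 0, 0, 0, 0)    # variance of the first diagonal element
cov_11_22 = inv_precision_cov(eps, 5, 0, 0, 1, 1)

# With nu = 1, as in the diffuse example, no moment exists.
try:
    inv_precision_mean(eps, 1)
    diffuse_has_mean = True
except ValueError:
    diffuse_has_mean = False
print(mean, var_11, cov_11_22, diffuse_has_mean)
```

Raising an error for $\nu \le 2$ or $\nu \le 4$ mirrors the existence conditions of the theorem and avoids returning meaningless numbers.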

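Since in this example $\nu'' = 1$, it is easy to show that none of the above moments
exist. If we had specified $\nu' = 5$, say, then first and second moments of $\tilde{h}^{-1}$ would
exist and be equal to $E(\tilde{h}^{-1}) = \frac{K}{3} I$ and

$$\operatorname{cov}(\tilde{h}^{\alpha\beta}, \tilde{h}^{\gamma\delta}) = \frac{K^2}{12} \Big[ \tfrac{2}{3}\, \delta_{\alpha\beta}\delta_{\gamma\delta} + \delta_{\alpha\gamma}\delta_{\beta\delta} + \delta_{\alpha\delta}\delta_{\beta\gamma} \Big], \qquad 1 \le \alpha, \beta, \gamma, \delta \le m,$$

where $\delta_{\ell k} = 1$ if $\ell = k$ and 0 if $\ell \ne k$ for $1 \le \ell, k \le m$.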
It is also important to observe that as the projection of each $y_i$, $i = 1, 2, \ldots, m$,
on the $q$-dimensional row space of $Z$ is unique, arbitrary specification of $(r - q)m$
elements of $P$ as equal to 0 does not influence the values assumed by the posterior
parameters. That is, none of the posterior parameters $P''$, $V''$, $\varepsilon''$ and $\nu''$ depend
on which $(r - q)m$ particular $p_{ij}$'s are set equal to zero.

4.2  Probabilistic  Prediction 

If when $q < m$ we assign a prior to $(\tilde{\Pi}, \tilde{h})$ with parameter $(0, M I, K I, 1)$, then
by (21) the unconditional distribution of the next sample observation $\tilde{y}^{(n+1)}$ given $z^{(n+1)}$
is non-degenerate multivariate Student with parameter $(0, k_y K I, 1)$ where
$k_y = 1 - z^{(n+1)t}(M I + V)^{-1} z^{(n+1)}$. Notice that setting $0 < \nu' < 2$ as we have done
here implies that the second moment of $\tilde{y}^{(n+1)}$ given $z^{(n+1)}$ does not exist, so
that the unconditional distribution of $\tilde{y}^{(n+1)}$ is extremely diffuse in this
particular example.